+++
title = 'Shared-memory multiprocessors'
+++

# Shared-memory multiprocessors

a multiprocessor system has many processors that can work on different tasks at the same time

in a shared-memory multiprocessor, all processors have access to the same (probably large) memory

memory is distributed across multiple modules, connected by an interconnection network

when memory is physically separate from the processors, all requests go through the network, introducing latency

if memory access latency is the same from all processors, you have a Uniform Memory Access (UMA) multiprocessor (but the latency doesn’t magically go away)

to improve performance, put a memory module next to each processor
leads to a collection of “nodes”, each with a processor and a memory module

each node is connected to the network. no network latency when a memory request is local, but a remote request has to go through the network

these are Non-Uniform Memory Access ([NUMA](https://youtu.be/jRx5PrAlUdY?t=1m39s)) multiprocessors

![screenshot.png](screenshot-25.png)

## Interconnection networks

suitability is judged in terms of:

- bandwidth — capacity of a transmission link to transfer data (bits or bytes per second)
- effective throughput — actual rate of data transfer
- packets — form of the data (fixed length and specified format, ideally handled in one clock cycle)

types commonly used:

- buses — set of wires that provides a single shared path for info transfer
    - suitable for a small number of processors (low contention)
    - a simple bus does not allow a new request until the response to the current request is provided
    - the alternative is a split-transaction bus, where other events can occur between a request and its response
- ring — point-to-point connections between nodes
    - low-latency option 1: bidirectional ring
        - halves latency, doubles bandwidth
        - increases complexity
    - low-latency option 2: hierarchy of rings
        - an upper-level ring connects the lower-level rings
        - average latency is reduced
        - the upper-level ring may become a bottleneck if lower-level rings communicate frequently
- crossbar — direct link between any pair of units
    - used in UMA multiprocessors to connect processors to memory modules
    - enables many simultaneous transfers, as long as no destination receives multiple requests
- mesh — like a net laid over all nodes
    - each node connects to its horizontal and vertical neighbours
    - wraparound connections can be introduced at the edges — “torus”
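the NUMA idea above can be sketched numerically: average access time depends on how many requests stay local. a rough sketch (the latency numbers here are made up for illustration, not from the notes):

```python
# Sketch: average memory access time in a NUMA system.
# LOCAL_LATENCY / REMOTE_LATENCY are hypothetical cycle counts,
# chosen only to illustrate the local-vs-remote gap.
LOCAL_LATENCY = 100    # request served by the node's own memory module
REMOTE_LATENCY = 300   # request that must cross the interconnection network

def average_access_time(local_fraction: float) -> float:
    """Weighted average latency given the fraction of accesses that are local."""
    return local_fraction * LOCAL_LATENCY + (1 - local_fraction) * REMOTE_LATENCY

# the more accesses stay local, the closer performance gets to the UMA best case
print(average_access_time(0.9))  # 120.0
print(average_access_time(0.5))  # 200.0
```

this is why placing data near the processor that uses it matters so much on NUMA machines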
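the claim that a bidirectional ring roughly halves latency can be checked with hop counts. a minimal sketch (node numbering and function names are my own, not from the notes):

```python
def uni_ring_hops(src: int, dst: int, n: int) -> int:
    """Hops on a unidirectional ring of n nodes (traffic flows one way)."""
    return (dst - src) % n

def bi_ring_hops(src: int, dst: int, n: int) -> int:
    """Hops on a bidirectional ring: take whichever direction is shorter."""
    d = (dst - src) % n
    return min(d, n - d)

# on an 8-node ring, node 0 -> node 5:
print(uni_ring_hops(0, 5, 8))  # 5
print(bi_ring_hops(0, 5, 8))   # 3  (going the other way around is shorter)
```

the worst case drops from n-1 hops to n//2 hops, which is where the halved latency (and, with links usable in both directions, doubled bandwidth) comes from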
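the effect of the torus’s wraparound links can be seen the same way, by comparing hop distances on a plain mesh vs a torus. a sketch assuming row-major node numbering on a cols × rows grid (my convention, not from the notes):

```python
def mesh_distance(a: int, b: int, cols: int, rows: int) -> int:
    """Manhattan distance on a 2-D mesh (no wraparound links)."""
    ax, ay = a % cols, a // cols
    bx, by = b % cols, b // cols
    return abs(ax - bx) + abs(ay - by)

def torus_distance(a: int, b: int, cols: int, rows: int) -> int:
    """Same grid, but edge nodes wrap around in each dimension."""
    ax, ay = a % cols, a // cols
    bx, by = b % cols, b // cols
    dx = min(abs(ax - bx), cols - abs(ax - bx))
    dy = min(abs(ay - by), rows - abs(ay - by))
    return dx + dy

# corner-to-corner on a 4x4 grid: node 0 to node 15
print(mesh_distance(0, 15, 4, 4))   # 6
print(torus_distance(0, 15, 4, 4))  # 2
```

wraparound cuts the worst-case distance roughly in half in each dimension, at the cost of the extra edge links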